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ABSTRACT 

Wireless sensors will soon be part of many industrial, mili- 
tary, and home networks. Of the various networking proto- 
cols, which include Zigbee, ISA100, Wireless HART, 6Low- 
PAN, and others, none has yet become a definitive standard. 
Neither have vendors standardized upon a given operating 
system, compiler, or microcontroller. Users of these sensor 
networks are often given no command-line, no internal doc- 
umentation, and no access to the internals of each device. 
This paper provides a thorough introduction to reverse en- 
gineering such devices, both in hardware and in software. 

1. INTRODUCTION 

That which follows is an introduction to reverse engineer- 
in;', arid exploiting low-powered wireless embedded systems. 
such as those winch implement Zigbee and other 802.15.4- 
based LPAN protocols. Example being taken chiefly from 
the first-hand experience of the author, most involve the 
MSP430 microcontroller. The TinyOS operating system and 
Telos B hardware are used in order to make the exercises re- 
peatable, but the techniques here presented are intended for 
use on black-box devices without documentation. 

2. MSP430STATIC 

Many examples in this paper will include examples from 
TinyOS being analyzed with MSP430static 1 , a disassembly 
analysis tool by the author of this paper. TinyOS examples 
were chosen because they may be freely distributed and re- 
sults may be cross-checked with symbols, neither of which 
can be done with a firmware image found in the field. Use of 
these symbols is akin to looking in the back of a textbook; 
they are good for proving a point, but with few exceptions 
the field reverse-engineer does not have the luxury of refer- 
encing them. 



M4S, as it wi 
script which 

1 http://msp430static.sf.net/ 



be called for short, is a hastily writ 
SQLite3 database. Que 
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made using SQL with a few pre-processing macros, such as 
interpretation of hexadecimal addresses. For example, one 
might select the last five function names and entry addresses 
by "select enhex (address) , name from funcs 
order by address desc limit 5;". 

Macros allow for queries without parameters to easily be 
called; for example, ".ivt" will print the Interrupt Vector 
Table (IVT) . Scripts are similar to macros, except that their 
names are delimited by slashes (/) rather than dots (.) and 
that they exist as files on disk rather than entries within a 
macros table. 

3. SCHEMATIC DIAGRAMS 

Reverse engineering the schematic diagram of a wireless sen- 
sor is necessary, but the connections of that schematic are 
not entirely random. For example, the MSP430F1611, which 
is commonly found in wireless sensors, has only two hard- 
ware serial ports. 2 In the Telos B, the SPI bus is connected 
to the first of these (USARTO) and the FTDI USB/serial 
chip to the second (USART1). 

This is true for clock pins, hardware-assisted I/O ports, 
JTAG pins, analog I/O, and similar special-purpose pins. 
Only general purpose I/O pins (GPIO) cannot be found by 
this method, but for most chip packaging these can be traced 
using syringes and a continuity tester. 

4. ACCESSING FIRMWARE 

In order to reverse engineer a wireless sensor, it is necessary 
to have access to the firmware. There are many ways to 
accomplish this, each of varying difficulty. 



First, the firmware may often be obtained from a device by a 
JTAG adapter, as many devices do not blow the protection 
fuses. Fuses are either EEPROM cells, as in the case of 
the PIC and AVR chips, which may cleared with the rest 
of flash memory, or they are true fuses in the sense that 
they physically and irreparably blow, as in the case of the 
MSP430. Even then, the fuse may be physically reset with 
either a micro-electronic probe station or more sophisticated 
hardware. 



When invasive or semi-invasive methods of firmware extrac- 
tion are impractical, attacks on software can be used instead. 



2 The exception to this rule is the MSP430 serial bootstrap 
loader (BSL), which uses a software serial port. 




Figure 1: Memory Map of the TinyOS Blink Appli- 
cation 



For example, there exists a timing vulnerability [4] [3] in the 
password comparison routine of many versions of the serial 
bootstrap loader of the MSP430. By exploiting this vulner- 
ability, it is possible to extract firmware even from a locked 
device. Use of stack overflow exploits, to be presented in this 
context in Section 11, has also been made by the author to 
extract locked firr 



5. AUTOMATED ANALYSIS 

5.1 Mapping Memory 

One of the simplest forms of scripted analysis, shown in 
Figure 1, is a memory map. In this example, the low byte 
of the address is the horizontal position while the high byte 
is the vertical position. Origin (0x0000) is the bottom-left 
corner, OxFFOO the top-left, and OxFFFF the top-right. 

By sight, one can recognize critical features of the applica- 
tion. At OxFFEO, near the top-right, is the Interrupt Vector 
Table (IVT). In the bottom two rows (0x0000 to 0x0200), 
which are the Special Function Registers (SFR), one can 
see the I/O ports, clocks, and similar functions which are 
configured by the image. 

The red region beginning at 0x4000, the 4 x 16 = 64 th line, 
is the actual code of the application. From the IVT, listed 
in Figure 1, it can be seen that the IVT's RESET vector, 
at OxFFFE, points to the bottom of flash memory. GCC 
for the MSP430 seems to always have that behavior, in that 
the RESET handler always points to the lowest address in 
flash memory. Another indication that this image came from 
GCC is that unused interrupts all point to the same address; 
other compilers typically leave those addresses as OxFFFF, 
the default state for flash memory. 

5.2 Graphs 

A callgraph, such as is depicted i 
graph theory terms-of functions a 
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Table 1: Interrupt Vector Table for Bli 



calls as edges. The sixteen entries of the IVT, from Ta- 
ble 1, are also seen here, in the bottom right. The pattern 
of many IVT entries passing to a single handler, which then 
calls a single leaf function, will become familiar to those 
reverse engineering MSP430 firmware compiled by GCC. 

6. MICROCONTROLLERS 

6.1 Harvard and Von Neumann 

Microcontroller architectures are considerably more varied 
than those of personal computers. Register counts vary, 
as do the purposes of registers. Unlike desktop processors, 
which are all Von Neumann for practical purposes, many mi- 
crocontrollers are Harvard architecture, having one memory 
for data and another for instructions. 

Von Neumann machines have a single address range within 
which are both executable code and non-executable data. 
This model will be familiar to those without embedded expe- 
rience. A pointer is a pointer, and its result may be fetched 
as either code or data. Embedded examples include most 
implementations of ARM, PowerPC, and the MSP430. 

By contrast, Harvard machines have two address spaces: one 
for code, and another for data. Separate memories provide 
the intentional advantage of much simpler hardware design, 
as both a word of code and a word of data may be fetched 
at the same time. As will be discussed later, this also com- 
plicated stack overflows, as Harvard machines are unable to 
execute data memory. Examples include the PIC, 8051, and 
AVR microcontrollers. 

6.2 Memory Mapped I/O, IVT 

When reverse engineering a unix application, it is common 
practice to identify system calls to the kernel as a means 
of finding functions of interest. For example, it might be 
worthwhile to trap all calls to readQ and write () as a means 
of spying on the traffic of a closed-source application. In em- 
bedded systems such as wireless sensors, there is rarely a ker- 
nel to handle I/O; rather, an application will itself directly 
peek and poke I/O ports at memory-mapped addresses. 

Suppose that we are looking for the serial byte transmit 
function of a Telos B device to its CC2420 radio. As that 




Figure 2: Call Graph for Blink 
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Table 2: USARTO Special Function Registei 



sensov uses the MSP430F1611, we can use the header files 
of any popular MSP430 compiler to identify the memory- 
mapped address to which bytes are written for transmission. 
By referencing the schematic 3 or by following circuit traces, 
it can be seen that the first serial port, USARTO, connects 
to the CC2420 radio. Table 2 provides a list of USARTO's 
Special Function Registers (SFR) which I/O functions will 
reference. 

The SQL query "select name from funcs where asm 
like ' d /.0077 d /. ' ; " returns two addresses with the TinyOS 2.0 
Base-Station application, and the shorter of these two func- 
tions is the SPI write function, 'Msp430SpiNoDmaP SpiByte 

This function's disassembled code is presented in Table 3. 
Note that the parameter, which was passed through rl5, 
is copied into local variable rll, which is then copied into 
&0x0077, UOTXBUF. For practice, optimize the function by 
hand to eliminate usage of rll. 
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Table 3: Msp430SpiNoDmaP SpiByte write 



16-BIT EMBEDDED OPERATING SYS- 
TEMS 



http://www.tinyos.net/scoop/special/hardware/ 
4 An unwelcome side effect of the TinyOS macro parser, 
NesC, is that functions often have horribly complex names 
delimited by the dollar sign ($). For purposes of legibility, 
this delimiter will be presented as a space in pvose and a 
double-underscore (_) in code within this paper. 



An embedded operating system is a very different piece of 
software from a desktop operating system. In the 16-bit 
world. These operating systems rarely if ever support pre- 
emptive multitasking or kernel/user separation. The OS is 
linked into the user application, and only a single user ap- 
plication will exist on a given device. Further, as everything 
is statically linked, a reverse engineer will never see sym- 
bols within a firmware image, only within pre-distribution 
executables such as those produced directly by a compiler. 



Popular open source operating syster 
include Contiki and TinyOS, but it i; 
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8. INLINING 

It is common for embedded operating systems to optimize 
memory usage and execution speed 1 >y automatically iuliumg 
functions where appropriate. In such operating systems, a 
function will often be quite long, with many leaf functions 
contained within it. Further, these long functions will often 
require more registers than are available as scratch registers, 
leading to opening clauses of many PUSH ii 
ending clauses of many POP instructions. 

9. NETWORKING STACKS 

As has been implied by the previous s> 
pect to find the networking stack mixed with the handling 
routines in the firmware of a wireless sensor node. Lack- 
ing multiprocessing, memory protection, and even memory 
management, there is little reason not to have the radio 
chip's driver simply call a function to handle an incoming 
packet. Similarly, there is little reason not to have a function 
directly call the radio driver to send a packet. 

The exception to this can be profitable, and it involves de- 
layed processing of a packet. Suppose, for example, that in- 
stead of directly calling the handler for a packet, the driver 
posts an event for later handling. When that event is exe- 
cuted, it might read the packet and then forward the packet 
to another event, which is also delayed. Doing this keeps 
the stack depth to a minimum and it can make power man- 
agement more effective for low-packet-rate applications. 

In the case of this chained approach, it might be possible to 
start a race condition. By replaying a valid administrative 
packet, followed by an invalid one, the attacker might cause 
the former to be validated and the latter to be executed. 

In either case, it is important to identify how packets are 
being handled as, in the case of event-posting, there will 
might be no direct call path from the radio driver to the 

fund ion handler in question. 

10. SIMULATION 

A good simulator is quite handy for testing a theory or prov- 
ing a point. For the MSP430 platform, your author is rather 
font of MSP430simu, part of the MSPGCC project. For 
a detailed demonstration of modifying a simulator, see [6] , 
which describes the modification of MSP430simu to produce 
LaTeX slides of an MSP430 stack overflow for [5]. 



11. STACK OVERFLOWS 

11.1 Von Neumann 

Stack overflow exploits of Von Neumann wireless sensors are 
similar to those of an older personal computer, in that the 
stack is executable. The primary difficulty is in finding vul- 
nerable code, as will be discussed in Section 14, and in fitting 
an exploit into an 802.15.4 packet, which has a hard limit of 
128 bytes. 

The first exploit of a wireless sensor node was written by the 
author in the summer of 2007. Descriptions can be found in 

[5] and [2]. 

Bypassing this restriction is possible by recognizing making 
use of unused portions of RAM. Usually, global variables are 
placed at the bottom of RAM and a stack grows downward 
from the top of RAM. In the case of the MSP430, code in- 
jected to this region is immediately executable. Once a flash 
memory driver has been written to this region, it becomes 
possible to copy small blocks to flash memory for a larger 
executable. This can then contain a driver for external flash, 
slowly loading a larger image to external memory. Finally, 
when everything is ready, the internal firmware image can 
be moved to external flash, and an image from external flash 
loaded. If such an image were to broadcast its own injection 
routines, it would be a self-propagating worm. 

Further, as an infected sensor would have access to the keys 
of its prior life, it could infect by the transmission of en- 
crypted packets. In this way, the infection could spread 
along a web of trust, even in the case of perfect, unbreak- 
able cryptography. 

11.2 Harvard 

Infecting a Harvard sensor is considerably more conrplicated 
in that injected code may not be run directly. Instead, pre- 
existing code must be called, and the purpose of that code 
is often quite different from the intended behavior of an ex- 
While non-executable stack overflows were performed by Gu 
in [7] , the first execution of foreign code was performed by 
Francillon in [1] by use of a technique known as return- 
oriented programming. In this technique, the tail of one 
function is called, which then returns to the tail of another 
function. Multiple tails, or meta-gadgets, can then be as- 
sembled into a larger gadget. 

For example, in collaboration with Francillon, the author 
once found it necessary to place an arbitrary value, such as 
OxDEAD, into rll of an MSP430 5 , after which the machine 
was to return to the entry point of anther function, OxBEEF. 
Performing an overflow that leaves 0x47B0, OxDEAD, OxBEEF 
on the stack at the moment of a return will pop OxDEAD 
into rlf , then return to OxBEEF if the contents of Table 3 
is in memory. 

By chaining many such gadgets together, Francillon was able 

to construct a meta-gadget thai copies information from an 
arbitrary address in data memory to an arbitrary address 




5 The MSP430 is not a Harvard machine, but all examples of 
this paper stick to that platform for the sake of consistency. 



in code memory. Once there, the code can be executed di- 
rectly, making possible such things as self-propagating sen- 



12. PATCHING 

As linkers lay out code in a regular fashion, often growing 
upward from a lower boundary, the position of code within 
an executable is predictable. This linking strategy leaves 
the upper regions of flash memory, with the exception of the 
IVT, empty on the MSP430. As no elegant tool yet exists 
for injecting code into such firmware, it can be manually 
patched into this upper region. 



A nice feature of the popular Intel Hex format for n 
troller code is that each line of ASCII text is a data record, 
concluding with ":0000000f FF". By grepping this line out of 
a file as it is prepended to another, a script, can easily merge 
two images, adding a patch to an image. 

13. BUS SNIFFING 

Also worth noting is that the busses used on these devices are 
unprotected and vulnerable to sniffing and injection. Traffic 
may even be diverted by a bus extender, one that repeats 
only those messages which it wishes to let through. Figure 3 
shows an I 2 C bus being tapped in 1 



14. FUZZING 

Fuzz testing of a wireless sensor proposes a unique difficulty, 
in that the simulators for many of these platforms are lacking 
in sufficient accuracy to run an entire image. In the absence 
of such a simulator, how is the reverse engineer to identify 
and trace the cause of a crash? 

In the case of blind fuzzing, many chips revert GPIO pins to 

a higli-iuipedanei input stale and. while TinvOS does not, 
many commercial wired ; eusoi j lei urging informa- 
tion over a serial port when booting. Either of these effects 
can be used to identify which packet caused a crash, so long 
as packets are sent slowly and the device does not lock up. 



For cases in which the firmware is known by the attacker, the 
author has found it profitable to patch the image to contain a 
handler function which preserves the mosf recently received 
packet. The fuzzing then includes with a high frequency the 
address of that handler. 

15. CONCLUSION 

Several techniques have been presented for the reverse en- 
gineering and exploitation of 16-bit wireless embedded sys- 
tems. It is suggested that the reader follow the citations 
of this paper to investigate any individual topic in greater 
detail. 
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